Language design for distributed stream processing
Abstract
Applications that combine live data streams with embedded, parallel, and distributed processing are becoming increasingly common. WaveScript is a domain-specific language that brings high-level, type-safe, garbage-collected programming to these domains. This is made possible by three primary implementation techniques, each of which leverages characteristics of the streaming domain. First, WaveScript employs an evaluation strategy that uses a combination of interpretation and reification to partially evaluate programs into stream dataflow graphs. Second, we use profile-driven compilation to enable many optimizations that are normally only available in the synchronous (rather than asynchronous) dataflow domain. Finally, an empirical, profile-driven approach also allows us to compute practical partitions of dataflow graphs, spreading them across embedded nodes and more powerful servers. We have used our language to build and deploy applications, including a sensor network for the acoustic localization of wild animals such as the yellow-bellied marmot. We evaluate WaveScript’s performance on this application, showing that it yields good performance on both embedded and desktop-class machines. Our language allowed us to implement the application rapidly, while outperforming a previous C implementation by over 35% and using fewer than half the lines of code. We evaluate the contribution of our optimizations to this success. We also evaluate WaveScript’s ability to extract parallelism from this and other applications.
Thesis Supervisor: Samuel Madden
Title: Associate Professor
Thesis Supervisor: Arvind
Title: Johnson Professor
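The profile-driven partitioning idea mentioned in the abstract can be illustrated with a toy sketch. This is not WaveScript's partitioner: the `Operator` type, the `best_cut` function, the linear-pipeline restriction, and the budget parameters are all hypothetical and chosen only for illustration. The sketch assumes each operator has been profiled for its CPU cost on the embedded node and for its output data rate, and it picks the cut point that sends the least data over the network while staying within the node's CPU budget.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Operator:
    """One stage of a linear stream pipeline (hypothetical type)."""
    name: str
    cpu_ms_per_sec: float     # profiled CPU time on the embedded node per second of input
    out_bytes_per_sec: float  # profiled data rate of this operator's output stream


def best_cut(pipeline: List[Operator],
             source_bytes_per_sec: float,
             cpu_budget_ms: float,
             link_budget_bytes: float) -> Optional[Tuple[int, float]]:
    """Choose how many leading operators to run on the embedded node.

    Returns (k, traffic), where the first k operators run on the node and
    `traffic` is the data rate crossing the network, or None if no cut
    satisfies both budgets. Among feasible cuts, the one with the lowest
    network traffic wins.
    """
    best = None
    cpu = 0.0
    for k in range(len(pipeline) + 1):
        if k > 0:
            cpu += pipeline[k - 1].cpu_ms_per_sec
        if cpu > cpu_budget_ms:
            break  # CPU cost only grows as more operators move to the node
        # With k == 0 the raw source stream is shipped to the server.
        traffic = source_bytes_per_sec if k == 0 else pipeline[k - 1].out_bytes_per_sec
        if traffic <= link_budget_bytes and (best is None or traffic < best[1]):
            best = (k, traffic)
    return best


if __name__ == "__main__":
    # Invented profile numbers, for illustration only.
    stages = [
        Operator("filter", 200, 40_000),
        Operator("fft", 500, 20_000),
        Operator("detect", 400, 100),
    ]
    print(best_cut(stages, source_bytes_per_sec=96_000,
                   cpu_budget_ms=800, link_budget_bytes=25_000))
```

With these invented numbers the sketch keeps `filter` and `fft` on the node and runs `detect` on the server, because placing `detect` on the node would exceed the CPU budget. A real partitioner would have to handle general dataflow graphs rather than a single linear chain.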
Similar resources
Change-Resilient Design and Dataflow Optimization for Distributed XML Stream Processors
We propose a new stream-processing framework based on a virtual assembly line (val) model. We instantiate the val framework obtaining ∆-XML, an approach for designing and optimizing distributed XML processing pipelines. val/∆-XML greatly simplifies the design of change-resilient dataflow pipelines: XML processors (called actors) can be inserted, deleted, and their “scope of work” (the parts of ...
Stream Processing on the Grid: an Array Stream Transforming Language
Specific requirements of stream processing on the Grid are discussed. We argue that when the stream processing paradigm is used for cluster computing, the processing components can be coded in the form of data-parallel recurrence relations with stream synchronization and filtering at the interfaces. We propose a programming language ASTL in which such components can be written and describe some...
Distributed S-Net
S-NET is a declarative coordination language and component technology primarily aimed at modern multicore/many-core chip architectures. It builds on the concept of stream processing to structure dynamically evolving networks of communicating asynchronous components, which themselves are implemented using a conventional language suitable for the application domain. We sketch out the design and i...
A Comprehension-Based Database Language and Its Distributed Execution
This paper describes a way to noticeably reduce the description cost of database operations executed in distributed computing environments by design of a novel declarative language to describe database operations, development of program transformation techniques to improve efficiency at execution time, and clarification of prerequisites to execute the programs in distributed computing environments...
Verteilung globaler Anfragen auf heterogene Stromverarbeitungssysteme (Deployment of Global Queries in Distributed and Heterogeneous Stream-Processing Systems)
Distributed in-network stream processing is more efficient than sending all data to a central processing unit. In the past few years, Stream-Processing Systems (SPSs) have established themselves as an interesting alternative to database systems for continuous query processing. There are many scenarios having w...
Research Statement Robert Soulé
With my research, I want to simplify the development of complex systems through the use of programming language technologies. Most of my work has focused on distributed stream processing [2, 3, 6, 7, 8], but I have also explored distributed storage systems [1], and security in peer-to-peer content distribution networks [4]. My methodology is to start by developing a formal model, and then to us...